RELEASE NOTES FOR A SORT OF A KIND...
Introductory chatter...
Here's one you'll use every day (grin).
This is a sort, a barely-adequate nevertheless robust kind of a
sort. It's A Sort of a Kind (hereafter "aSoaK" for brevity's sake).
It's robust in that it will sort arbitrarily large files. It's
barely-adequate because it implements only the bare minimum of
features I envision for a useful sorting utility. I produced it
this way, in this form, because I, personally, need something now,
and because I don't have time to do more than this right now.
In fact, during testing, I hit upon a much faster scheme for doing
the same job, but I don't have time to go back and rewrite, so this
is shipping the way it is. If you're feeling like a poor relation,
don't. This is "barely-adequate" only in contrast to what I _wish_
I had time to do. It's better than anything else I've seen -- and
my natural preference is to make do with someone else's tools
rather than write my own. In particular, aSoaK succeeds where (for
example) the sort in Word goes south, and it sorts to a more
reasonable order. Moreover, my own results (one tester reports
contrary evidence) indicate that it is about 3 times faster than
Word on larger files (it's amazingly faster on small files). So:
while I can imagine a much better sort in terms of both features
and performance, this will (and will have to) do for now...
System stuff...
IMPORTANT: aSoaK is System 7 or above ONLY. Its only interface is
Drag and Drop, so you cannot use it with earlier Systems. If you
double-click on it from a System 6 machine, you will get an error
message and the software will quit gracefully.
To use aSoaK _with_ System 7 or above, simply select the files you
want to process and drag them onto the program's icon or an alias of
it. New files with the extension ".SRT" will be created, and your
original source files will remain unaltered.
Usage notes...
A Sort of a Kind will sort arbitrarily large text files. File
size is limited only by allocated RAM; a reliable allocation is the
file size plus 128K. If you get a message that your file is too
large, increase aSoaK's memory allocation like this:
1. Get Info on the file. Note its size in kilobytes.
2. Add 128K to the size of the file.
3. Get Info on aSoaK and change the figure in the Preferred Size box
in the lower right-hand corner to the value you came up with in step
2.
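As a rough illustration of the arithmetic in the steps above, here is a small Python sketch. It is not part of aSoaK; the function name and the choice to round the file size up to whole kilobytes are assumptions made for this example.

```python
import math

def preferred_size_k(file_size_bytes, headroom_k=128):
    """Suggested Preferred Size in K: file size (rounded up to whole K) plus 128K."""
    file_size_k = math.ceil(file_size_bytes / 1024)  # assumed rounding, not from aSoaK
    return file_size_k + headroom_k

# A 900K text file (921,600 bytes) would call for 900 + 128 = 1028K.
print(preferred_size_k(921600))
```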
If you ever actually do get a file that's larger than all the
memory you can give aSoaK, drop me a note and I'll see if I can
come up with something.
Practically speaking, the length of any one paragraph must be less
than 16,000 bytes, and a warning for exceeding this limit is built
in. The chances of hitting this wall are even slimmer than those of
having a file too large to fit in memory, I expect.
The default sort order is strict ASCII (the sort is case-sensitive,
and accented characters are not equated to their unaccented forms). In
addition, there are three modifier keys that you can use at Drag &
Drop time to influence the sort:
1. Shift-D&D induces a (quasi-)lexical sort. Accented characters
are equated to their unaccented equivalents, and case is
(semi-)ignored by a double-weighting scheme. In other words, case
is taken into account _only_ to break ties between characters that
are equal once case and accents are ignored. Like this:
Case-sensitive: M < N < m
Case-insensitive: m = M < N
Double-weighting: M < m < N
With a case-insensitive sort, there is no prohibition against "m"
coming before "M". Double-weighting cures that ill. Take note,
however, that accented characters are double-weighted _only_ as to
capitalization. Accented words are not ghettoized nor even
(semi-)ghettoized for being accented. If you Shift-D&D the enclosed
file called "Alice & Bill Toy With Names", you'll see how
(non-ghettoizing) double-weighting works.
2. Command-D&D simply yields the sort in reverse order.
3. Option-D&D causes _exact_ duplicate lines to be shown only once.
Case and accents are _honored_ when deciding if a line is a
duplicate.
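The double-weighting idea behind Shift-D&D can be sketched in Python as a two-part sort key. This is only an illustration under stated assumptions, not aSoaK's actual code: the function names are invented here, accents are stripped with Unicode decomposition, and case breaks ties on the whole line rather than character by character.

```python
import unicodedata

def strip_accents(text):
    """Drop combining marks so that an accented letter equals its plain form."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def double_weight_key(line):
    """Primary weight ignores accents and case; secondary weight restores case."""
    plain = strip_accents(line)
    return (plain.lower(), plain)

letters = ["m", "N", "M"]
print(sorted(letters))                         # strict ASCII: M, N, m
print(sorted(letters, key=double_weight_key))  # double-weighted: M, m, N
```

Because accents are removed from both parts of the key, an accented word ties with its unaccented twin and simply stays where the input order put it, which is one way to read "not ghettoized".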
These three modifier keys can be "stacked". That is, you can hold
down more than one, and each will be honored. So, for example,
Shift-Option-D&D will produce a lexical sort with exact duplicates
shown only once.
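Here is a hedged Python sketch of how three independent modifiers might stack. The parameter names mirror the keys, but the function itself, the order of the steps, and the stand-in lexical key are assumptions for illustration, not a description of aSoaK's internals.

```python
import unicodedata

def lexical_key(line):
    """Stand-in for Shift: ignore accents and case first, then break ties on case."""
    plain = "".join(ch for ch in unicodedata.normalize("NFD", line)
                    if not unicodedata.combining(ch))
    return (plain.lower(), plain)

def sort_lines(lines, shift=False, command=False, option=False):
    """Apply each modifier independently, so any combination is honored."""
    if option:
        # Exact duplicates only: case and accents are honored here.
        lines = list(dict.fromkeys(lines))
    return sorted(lines, key=lexical_key if shift else None, reverse=command)

sample = ["pear", "Apple", "apple", "pear"]
print(sort_lines(sample, shift=True, option=True))  # lexical, duplicates shown once
```

With this sample, "pear" appears once, and "Apple" lands just ahead of "apple" instead of being separated from it as it would be in strict ASCII order.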
If you have used any of the modifier keys, the progress window will
tell you which ones.
If you make a mistake or if you're tired of waiting for aSoaK to
finish, holding down Command-Period will abort the process.
A Sort of a Kind in Real Life...
aSoaK and Word: Word sorts by means of a (to my mind) excessively
complicated mathematical weighting scheme. This is why it produces
such odd results so slowly and why it breaks down with large or
hirsute files. aSoaK's lexical sort (Shift-D&D) produces what Word
is striving for, faster and cleaner, without the size and
complexity limitations.
aSoaK and Idiot Randomizer: Idiot Randomizer was built for Word's
(goofy) way of sorting, so aSoaK's results will not be identical.
They will, however, achieve the same end, which is why I didn't
build randomizing into aSoaK.
aSoaK and FontFischer: It was FontFischer that made me realize that
I couldn't make do with what I had. Word chokes on large
FontFischer files, and I have need to make a _big_ font book.
aSoaK and Torquemada: Torquemada the Inquisitor saves me a lot of
time, more than you might ever guess. My idealized sort would
permit you to, for example, weight the fields of database report
records, so that a report like this:
"Greg","Swann","70640,1574"
"Shane","Stanley","100033,317"
"Kip","Shaw","72320,1301"
could be sorted on the _last_ name, rather than the first. Users of
Xdata will see the utility of this kind of power at once. But: we
ain't there yet, and I won't have time for IdealSort for quite some
while. However: here's where Torque saves me (and you) a lot of
time. This set:
^?","^*","^~^p
^*|^?","^*","^~^p
will prepend a sort key consisting of the last name field to each
record. You would run the resulting TQM file on aSoaK, then use
this set:
^?|^*^p
^*^p
to strip out the sort key. All sorts of analogous stunts can be
pulled, using Torquemada or other text-processing utilities.
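The same prepend-sort-strip stunt can be sketched in Python for readers without Torquemada. This is an illustration only: the helper names, the "|" separator handling, and the use of Python's csv module to pick out the last-name field are assumptions, and the plain list sort stands in for a default (strict ASCII) run through aSoaK.

```python
import csv

records = [
    '"Greg","Swann","70640,1574"',
    '"Shane","Stanley","100033,317"',
    '"Kip","Shaw","72320,1301"',
]

def prepend_key(record, field=1, sep="|"):
    """Put the chosen field (1 = last name) in front of the record as a sort key."""
    fields = next(csv.reader([record]))
    return fields[field] + sep + record

def strip_key(line, sep="|"):
    """Remove everything up to and including the first separator."""
    return line.split(sep, 1)[1]

keyed = sorted(prepend_key(r) for r in records)  # stands in for the aSoaK pass
result = [strip_key(line) for line in keyed]
for line in result:
    print(line)
```

The records come out in last-name order (Shaw, Stanley, Swann) with the temporary key removed.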
Very Best,
Greg Swann
gswann@kagi.com
gswann@primenet.com
USPS: 3608 West Cochise Drive
Phoenix, AZ 85051
8/1/98